Overview

Dataset Statistics

Number of Variables 10
Number of Rows 18641
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 4.4 MB
Average Row Size in Memory 248.5 B
Variable Types
  • Numerical: 5
  • Categorical: 4
  • DateTime: 1
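
The two memory figures above can be cross-checked against the row count. The sketch below assumes the report's "MB" means 2^20 bytes; the report does not say so, and with 10^6 bytes the product would round to 4.6 rather than 4.4.

```python
# Cross-check "Total Size in Memory" against rows x average row size.
n_rows = 18641
avg_row_bytes = 248.5  # "Average Row Size in Memory"

total_bytes = n_rows * avg_row_bytes
total_mib = total_bytes / 2**20  # assumption: "MB" means 2**20 bytes

print(f"{total_bytes:,.0f} B = {total_mib:.1f} MiB")
```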

Dataset Insights

id is uniformly distributed
extra_features_count is skewed

Variables


id

numerical

Approximate Distinct Count 18641
Approximate Unique (%) 100.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 298256
Mean 10036.7556
Minimum 0
Maximum 20097
Zeros 1
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • id is uniformly distributed
  • id has negligible right skew (γ1 = 0.0025), consistent with a uniform distribution

Quantile Statistics

Minimum 0
5-th Percentile 1003
Q1 5033
Median 10036
Q3 15034
95-th Percentile 19095
Maximum 20097
Range 20097
IQR 10001

Descriptive Statistics

Mean 10036.7556
Standard Deviation 5800.7978
Variance 3.3649e+07
Sum 1.871e+08
Skewness 0.002455
Kurtosis -1.1954
Coefficient of Variation 0.578
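
The "uniformly distributed" insight for id can be sanity-checked against the closed-form moments of a uniform distribution. This is a rough sketch that treats id as continuous on [Minimum, Maximum]. That is an assumption, since the report does not state how it tests for uniformity.

```python
import math

# Values reported for `id` above.
minimum, maximum = 0, 20097
reported_mean = 10036.7556
reported_std = 5800.7978
reported_kurtosis = -1.1954

# Closed-form moments of a continuous uniform distribution on [minimum, maximum].
expected_mean = (minimum + maximum) / 2
expected_std = (maximum - minimum) / math.sqrt(12)
expected_kurtosis = -6 / 5  # excess kurtosis of any uniform distribution

# Coefficient of variation is the standard deviation divided by the mean.
cv = reported_std / reported_mean

print(f"mean:     expected {expected_mean:.1f}, reported {reported_mean}")
print(f"std:      expected {expected_std:.1f}, reported {reported_std}")
print(f"kurtosis: expected {expected_kurtosis}, reported {reported_kurtosis}")
print(f"coefficient of variation: {cv:.3f}")
```

The reported mean, standard deviation, and kurtosis all sit close to the uniform values, which agrees with the insight.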

make

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 1329091
  • The most frequent value (Nissan) occurs about 39.09 times as often as the second most frequent value (Nissan Motor Egypt)

Length

Mean 6.2993
Standard Deviation 1.8715
Median 6
Minimum 6
Maximum 18

Sample

1st row Nissan
2nd row Nissan
3rd row Nissan
4th row Nissan
5th row Nissan

Letter

Count 116496
Lowercase Letter 96925
Space Separator 930
Uppercase Letter 19571
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Nissan, Nissan Motor Egypt) account for over 50.0% of rows
  • The most frequent word (nissan) occurs about 40.09 times as often as the second most frequent word (egypt)
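
The two ratios for make (39.09 and 40.09) look inconsistent but are not: one compares whole values, the other compares individual words. The sketch below recovers the approximate category counts from the value-level ratio and checks them against the space count and the word-level ratio. It assumes make has exactly the two categories named in the report.

```python
n_rows = 18641
value_ratio = 39.09  # "Nissan" vs "Nissan Motor Egypt", value-level insight

# With only two categories: n_rows = n_second * (1 + value_ratio).
n_second = round(n_rows / (1 + value_ratio))  # rows with "Nissan Motor Egypt"
n_first = n_rows - n_second                   # rows with "Nissan"

# "Nissan Motor Egypt" contains two spaces; "Nissan" contains none.
spaces = 2 * n_second

# The word "nissan" appears in every row, "egypt" only in the second category.
word_ratio = n_rows / n_second

print(n_first, n_second, spaces, round(word_ratio, 2))
```

Under that assumption the counts reproduce the reported Space Separator total of 930 and the word-level ratio of 40.09.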

model

categorical

Approximate Distinct Count 6
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 1311526
  • The most frequent value (Sunny) occurs about 4.58 times as often as the second most frequent value (Qashqai)

Length

Mean 5.3571
Standard Deviation 0.7892
Median 5
Minimum 4
Maximum 14

Sample

1st row Juke
2nd row Juke
3rd row Juke
4th row Juke
5th row Juke

Letter

Count 99859
Lowercase Letter 81216
Space Separator 2
Uppercase Letter 18643
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Sunny, Qashqai) account for over 50.0% of rows
  • The most frequent word (sunny) occurs about 4.58 times as often as the second most frequent word (qashqai)

model_year

numerical

Approximate Distinct Count 24
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 298256
Mean 2016.3775
Minimum 1999
Maximum 2023
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • model_year is skewed left (γ1 = -1.0509)

Quantile Statistics

Minimum 1999
5-th Percentile 2008
Q1 2014
Median 2017
Q3 2020
95-th Percentile 2022
Maximum 2023
Range 24
IQR 6

Descriptive Statistics

Mean 2016.3775
Standard Deviation 4.3211
Variance 18.6716
Sum 3.7587e+07
Skewness -1.0509
Kurtosis 1.6756
Coefficient of Variation 0.002143
  • model_year is not normally distributed (p-value 7.5699e-04)
  • model_year has 312 outliers
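
The report does not state how outliers are counted. A common convention is Tukey's rule, which flags values more than 1.5 × IQR beyond the quartiles. The sketch below applies that rule to the model_year quartiles as an assumption; `tukey_fences` is a hypothetical helper, not part of the tool that produced this report.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) bounds; values outside them count as outliers."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Q1 and Q3 reported for model_year.
lower, upper = tukey_fences(2014, 2020)
print(lower, upper)
```

Under this rule the upper fence lies above the reported maximum of 2023, so any flagged rows would be older cars (model years before 2005), which fits the left skew.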

kilometers

numerical

Approximate Distinct Count 576
Approximate Unique (%) 3.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 298256
Mean 94673.4225
Minimum 0
Maximum 285000
Zeros 237
Zeros (%) 1.3%
Negatives 0
Negatives (%) 0.0%
  • kilometers is skewed right (γ1 = 0.2882)

Quantile Statistics

Minimum 0
5-th Percentile 9999
Q1 43000
Median 90000
Q3 139999
95-th Percentile 200000
Maximum 285000
Range 285000
IQR 96999

Descriptive Statistics

Mean 94673.4225
Standard Deviation 59974.6656
Variance 3.597e+09
Sum 1.7648e+09
Skewness 0.2882
Kurtosis -0.7541
Coefficient of Variation 0.6335

transmission_type

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 1376242
  • The most frequent value (Automatic) occurs about 16.52 times as often as the second most frequent value (Manual)

Length

Mean 8.8288
Standard Deviation 0.696
Median 9
Minimum 6
Maximum 9

Sample

1st row Automatic
2nd row Automatic
3rd row Automatic
4th row Automatic
5th row Automatic

Letter

Count 164577
Lowercase Letter 145936
Space Separator 0
Uppercase Letter 18641
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Automatic, Manual) account for over 50.0% of rows
  • The most frequent word (automatic) occurs about 16.52 times as often as the second most frequent word (manual)

price

numerical

Approximate Distinct Count 793
Approximate Unique (%) 4.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 298256
Mean 274339.0913
Minimum 10000
Maximum 1384000
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • price is skewed right (γ1 = 1.6013)

Quantile Statistics

Minimum 10000
5-th Percentile 129000
Q1 181000
Median 248000
Q3 337000
95-th Percentile 511000
Maximum 1384000
Range 1374000
IQR 156000

Descriptive Statistics

Mean 274339.0913
Standard Deviation 128450.0286
Variance 1.6499e+10
Sum 5.114e+09
Skewness 1.6013
Kurtosis 4.4929
Coefficient of Variation 0.4682
  • price is not normally distributed (p-value 4.9846e-05)
  • price has 614 outliers
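
The right skew in price also shows up as a gap between mean and median. As a quick illustration, the sketch below computes the nonparametric skew, (mean − median) / standard deviation. This is a different measure from the γ1 the report uses and is included only as a cross-check.

```python
# Values reported for price above.
mean = 274339.0913
median = 248000
std = 128450.0286

# Nonparametric skew: positive when the mean sits above the median.
nonparametric_skew = (mean - median) / std
print(f"{nonparametric_skew:.3f}")
```

A positive value here points the same way as the reported γ1 = 1.6013: a long tail of expensive listings pulls the mean above the median.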

priced_at

datetime

Approximate Distinct Count 226.3906
Approximate Unique (%) 1.2%
Missing 0
Missing (%) 0.0%
Memory Size 298256
Minimum 2022-02-02 00:00:00
Maximum 2023-04-30 00:00:00
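
The date range and the distinct-count estimate together hint at how densely the period is covered. The sketch below assumes priced_at is recorded at daily granularity, which the midnight timestamps suggest but the report does not confirm.

```python
from datetime import date

first = date(2022, 2, 2)   # reported Minimum
last = date(2023, 4, 30)   # reported Maximum

# Number of calendar days in the range, counting both endpoints.
calendar_days = (last - first).days + 1

approx_distinct = 226.3906  # distinct-count estimate from the report
coverage = approx_distinct / calendar_days

print(calendar_days, f"{coverage:.0%}")
```

Under that assumption, roughly half of the calendar days in the range appear in the data.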

mileage_category

categorical

Approximate Distinct Count 5
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 168262

Length

Mean 7.3682
Standard Deviation 1.7262
Median 8
Minimum 5
Maximum 9

Sample

1st row 200k+
2nd row 200k+
3rd row 0-50k
4th row 100k-150k
5th row 0-50k

Letter

Count 31052
Lowercase Letter 31052
Space Separator 0
Uppercase Letter 0
Dash Punctuation 17664
Decimal Number 87658
  • The top 2 categories (50k-100k, 0-50k) account for over 50.0% of rows
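
The labels suggest that mileage_category is a binned version of kilometers, although the report does not say how it was derived. The sketch below shows one way such a binning could be written. The function name, the bucket edges, the inclusive lower bounds, and the "150k-200k" label are all assumptions: the report names only four of the five categories.

```python
from bisect import bisect_right

# Assumed bucket edges in kilometers; "150k-200k" is inferred, not reported.
EDGES = [50_000, 100_000, 150_000, 200_000]
LABELS = ["0-50k", "50k-100k", "100k-150k", "150k-200k", "200k+"]

def mileage_category(kilometers):
    """Map a kilometers reading to a bucket label (lower edge inclusive)."""
    return LABELS[bisect_right(EDGES, kilometers)]

for km in (0, 43_000, 90_000, 139_999, 200_000, 285_000):
    print(km, mileage_category(km))
```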

extra_features_count

numerical

Approximate Distinct Count 39
Approximate Unique (%) 0.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 298256
Mean 12.4516
Minimum 1
Maximum 39
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • extra_features_count is skewed right (γ1 = 0.7884)

Quantile Statistics

Minimum 1
5-th Percentile 4
Q1 6
Median 9
Q3 18
95-th Percentile 26
Maximum 39
Range 38
IQR 12

Descriptive Statistics

Mean 12.4516
Standard Deviation 7.7909
Variance 60.6976
Sum 232110
Skewness 0.7884
Kurtosis -0.54
Coefficient of Variation 0.6257
  • extra_features_count is not normally distributed (p-value 5.6191e-21)
  • extra_features_count has 33 outliers
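
Since no cells are missing, each numerical variable's Sum should equal its Mean times the number of rows, up to the rounding in the report. The sketch below checks this for all five numerical variables using only figures printed above.

```python
n_rows = 18641

# (mean, sum) pairs as reported for each numerical variable.
reported = {
    "id": (10036.7556, 1.871e8),
    "model_year": (2016.3775, 3.7587e7),
    "kilometers": (94673.4225, 1.7648e9),
    "price": (274339.0913, 5.114e9),
    "extra_features_count": (12.4516, 232110),
}

# Relative difference between mean * n_rows and the reported sum.
relative_errors = {
    name: abs(mean * n_rows - total) / total
    for name, (mean, total) in reported.items()
}

for name, err in relative_errors.items():
    print(f"{name}: relative difference {err:.1e}")
```

All five differences fall within the precision of the rounded sums, so the reported means and sums agree with each other.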

Interactions

Correlations

Missing Values